68020
CROSSASSEMBLER
Version 1.0
Autumn 1991
by
Andrew E. Romer
The information in this document has been carefully checked and is
believed to be entirely reliable. No responsibility, however, is assumed for
inaccuracies.
Index
1. Introduction
2. Assembler command line
3. Source file format
3.1. Empty statement
3.2. Label statement
3.3. Comment statement
3.4. Full statement
4. Statement fields
4.1. Label field
4.2. Operation field
4.3. Operand field
4.4. Comment field
5. Character set
6. Microprocessor instructions
7. Assembler directives
7.1. END
7.2. ORG
7.3. EQU
7.4. SET
7.5. REG
7.6. DC - define constant
7.7. DCB - define constant block
7.8. DS - define storage
8. Expressions
8.1. Numeric values
8.2. Order of precedence
8.3. Operators
9. Assembler processing
10. Addressing modes
10.1. Size
10.2. QUICK instructions
11. Assembler output
12. Assembly listing
13. Error reporting
14. References
1. Introduction
The 68020 Cross-Assembler is an IBM PC (or compatible) program that
processes source program statements written in the 68020 Assembly Language and
produces machine-readable binary code.
This Assembler is an upgrade of the 68000 Cross-Assembler, Ref. 2; it has
been designed to conform to the format defined by Motorola (Ref. 1).
2. Assembler command line
The Assembler is invoked by the following command line, entered at the DOS
prompt:
asm [-switches] filename
where 'filename' is the name of the source file. If 'filename' has no
extension then '.asm' extension is assumed by default. Otherwise the specified
extension is used.
The following optional switches are valid:
c - Generates full hex code listing. Only one line of listing is
generated by default. This switch has no effect if the -l
switch is not invoked.
h - Displays a brief help message on the screen. No other
switches have any effect if this switch is invoked, and
'filename' is not processed.
l - Enables assembly listing file 'filename.lis' generation.
No assembly listing file is generated by default.
n - Disables the target code generation. The target code
'filename.h68' is generated by default.
The switches can be entered, in arbitrary sequence, without intervening
spaces, after the '-' character, or each switch can be entered following a
separate '-', in which case separating spaces are required. Therefore the
following examples are valid:
asm -lc filename
asm -cl filename
asm -l -c filename
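The switch rules above can be sketched in Python. This is only an
illustration: the function name parse_command_line and its return value are
hypothetical and are not taken from the assembler's own source.

```python
import os

def parse_command_line(args):
    """Split DOS-style arguments into a set of switch letters and a file name."""
    switches = set()
    filename = None
    for arg in args:
        if arg.startswith("-"):
            # "-lc" and "-l -c" both contribute the letters l and c
            switches.update(arg[1:])
        else:
            filename = arg
    if filename is not None and not os.path.splitext(filename)[1]:
        # no extension given: '.asm' is assumed
        filename += ".asm"
    return switches, filename
```

With this sketch, parse_command_line(["-lc", "prog"]) and
parse_command_line(["-l", "-c", "prog"]) give the same result.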
The help message can also be displayed by entering:
- asm
(without arguments), or
- asm ?
at the DOS prompt.
3. Source file format
An assembler source file is an ASCII file which contains a sequence of
source statements. The first source statement begins with the first character
of the source file and is terminated by the 'newline character', NL; the
statements following the first are delimited by NL's.
Under DOS NL is a sequence of two characters: Carriage Return CR
(hexadecimal 0D) followed by Line Feed LF (hexadecimal 0A). The characters
contained between, but excluding, the NL delimiters are the statement context.
3.1. Empty statement
A source statement whose context consists exclusively of white space
(blanks and horizontal tabs) is called an empty statement. Empty statements do
not generate target code, but are included in the assembly listing.
3.2. Label statement
A label statement consists of a 'valid first label-character', optionally
followed by a sequence of 'valid subsequent label-characters', followed by a
colon ':', and, optionally, by white space. Valid first, and subsequent, label
characters are defined in the paragraph "label field" below.
No white space may precede the label.
3.3. Comment statement
A statement beginning with the asterisk ('*') is a comment statement.
Comment statements do not generate target code, but their context is included
in the assembly listing. Comments can also be included in the comment field of
a full statement.
3.4. Full statement
A full statement consists of up to 4 fields: label field, operation field,
operand field, and comment field. The fields are separated by white space.
The label field is optional as a rule. The exceptions to this rule are
listed in 7. Assembler directives.
The operation field is obligatory. The operand field is not always
required, and if it is not required then the characters entered in the operand
field are regarded as belonging to the comment field. The comment field is
optional.
4. Statement fields
4.1. Label field
If the first character of a statement line is a white-space character then
the label field is empty. Otherwise it must be a 'valid first label-character',
optionally followed by 'valid subsequent label-characters'.
Valid first label-character may be any of the following:
- letters of the alphabet A...Z, a...z
- the underscore _
- the full stop .
and valid subsequent label-characters may be:
- letters of the alphabet A...Z, a...z
- the numerals 0...9
- the underscore _
Upper and lower case letters are not distinct, i.e. name, NAME or nAMe are
regarded as identical. Only the first eight characters are significant, i.e.
longname, longnameone, longname123 are regarded as identical, but will be
passed to the assembly listing unchanged, as will the spelling using upper and
lower cases.
Using labels differing in the characters beyond the eighth, and labels
using different case spelling, is not recommended as it makes the source code
more difficult to understand.
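The label rules can be summarized by the following Python sketch. The name
label_key is hypothetical, and folding to upper case is an assumption; the
manual only states that case is not significant.

```python
import re

# First character: letter, underscore or full stop.
# Subsequent characters: letter, digit or underscore.
LABEL_RE = re.compile(r"[A-Za-z_.][A-Za-z0-9_]*\Z")

def label_key(label):
    """Return the form under which a label is compared: case-folded, 8 characters."""
    if not LABEL_RE.match(label):
        raise ValueError("invalid label: " + label)
    return label.upper()[:8]
```

Here label_key("longname123") and label_key("LongNameOne") both give
"LONGNAME", which is why such labels are regarded as identical.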
Certain operations require the label field to be present, and some require
it not to be present (7. Assembler directives). If the source line contravenes
either of these requirements then an error message, or warning message, is
generated by the assembler (see 13. Error reporting).
Labels associated with an opcode become equal to the value of the
assembler program counter at the time when the source line is read; those
associated with directives are defined by the directive itself.
The uses of a label include: a symbol in an expression, an address
pointer.
4.2. Operation field
An operation field is always required in a full statement. An operation,
represented by the operation mnemonic, can either be a microprocessor
instruction, or an assembler directive. An instruction, together with its
operand, if required, will cause the assembler to generate a corresponding
binary operation code (opcode) that can be acted upon by the microprocessor.
The opcode generated is entered as a sequence of hexadecimal digits in the
target and listing files.
An assembler directive generates no opcode; instead it instructs the
assembler to follow a specified course of action.
4.3. Operand field
If the opcode or directive requires an operand then the field immediately
following the operation field is the operand field, otherwise it is the
comment field. The operand field format depends on the operation it follows.
For microprocessor instructions (opcodes) the operand format will be found in
Ref. 1, for directives it will be defined together with the directive
definition in this manual, see 7. Assembler directives.
If an operand is absent, and it is required for an operation, then an error
message is issued.
4.4. Comment field
The comment field is optional. It generates no code, but is passed to the
assembly listing unchanged. It can therefore be used to explain the meaning
of the operation it is attached to. Skilful use of comments, whether in
comment statements or in comment fields, is an important part of good
programming practice.
5. Character set
Except in character strings delimited by single quotes, the assembler does
not distinguish between upper and lower cases of letters of the alphabet.
All printable characters are recognized by the assembler in quoted strings
of characters. In these the single quote character is represented by
repetition, to distinguish a single quote from the string terminator: 'It''s a
string' is read by the assembler as: It's a string.
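A minimal Python sketch of the quote-doubling rule follows. The function name
unquote is hypothetical, and the way malformed strings are rejected is an
assumption, not the assembler's documented behaviour.

```python
def unquote(text):
    """Decode a single-quoted assembler string, turning '' into one quote."""
    if len(text) < 2 or text[0] != "'" or text[-1] != "'":
        raise ValueError("not a quoted string")
    body = text[1:-1]
    # A lone quote inside the body would have terminated the string early.
    if "'" in body.replace("''", ""):
        raise ValueError("unescaped quote inside string")
    return body.replace("''", "'")
```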
Characters valid in the label field have been defined in the field's
description.
The following characters are valid in the operation field:
- the letters of the alphabet,
- the full stop '.',
- the characters '[', ']', ':' (in bit field operations),
The operand field recognizes:
- the letters of the alphabet,
- the decimal digits,
- the numeric base designator prefix characters: $, @, %,
- the ASCII constant delimiter: ',
- the arithmetic operators: +, -, *, /, \,
- the Boolean operators: & (AND), ! (OR), ~ (NOT),
- shift operators: <<, >>
- the special characters: , (comma), : (colon), . (full stop), and the
brackets (, ), [, ], {, },
All printable characters are recognized in comments.
6. Microprocessor instructions
All mnemonics defined by Motorola are recognized, as are the size
specifiers:
.b - byte (8 bits),
.s - short (8 bits),
.w - word (16 bits),
.l - long word (32 bits).
.b, .w, and .l are used with operations other than branch operations,
.s, .w, and .l with branch operations.
It is beyond the scope of this manual to describe the details of
mnemonics; the full description can be found in Ref. 1.
If no size specifier is present then WORD size '.w' is assumed by default.
In branch operations the size of the operation is calculated by the assembler
and the smallest size necessary is used, provided that the destination of the
branch operation is known when the operation is processed.
If the destination is not known then a 16-bit branch is assumed by default.
If the assembler finds later that an 8-bit branch would be sufficient then a
warning is issued; if the long branch is found necessary then the operation is
flagged as an error.
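The branch sizing rules can be sketched in Python as follows. The function
names are hypothetical. The exclusion of 0 and -1 from short displacements
follows the Motorola encoding (an 8-bit field of $00 announces a 16-bit
displacement and $FF a 32-bit one); it is assumed, not confirmed, that the
assembler applies exactly this test.

```python
def branch_size(displacement):
    """Pick the smallest branch size code (S, W or L) for a known displacement."""
    if -128 <= displacement <= 127 and displacement not in (0, -1):
        return "S"
    if -32768 <= displacement <= 32767:
        return "W"
    return "L"

def check_forward_branch(displacement):
    """Second-pass verdict for a branch assumed to be 16-bit in the first pass."""
    needed = branch_size(displacement)
    if needed == "S":
        return "warning"   # 16 bits were more than necessary
    if needed == "L":
        return "error"     # 16 bits cannot hold the displacement
    return "ok"
```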
7. Assembler directives
7.1. END
The END directive indicates the end of the source file; any source lines
following this directive will be ignored by the assembler. This directive does
not require an operand and must not have a label.
The use of the END directive is optional; if it is not present the
assembler will process source lines until the end of the source file.
7.2. ORG
Format:
[<label>] ORG <expression>
The ORG (origin) directive resets the assembler's program counter to the
directive's operand. The operand can be any valid arithmetic expression.
The directive may have, but does not require, a label. The label, if
present, becomes equal to the directive's operand.
7.3. EQU
Format:
<label> EQU <expression>
The EQU (equate) directive equates the value of the obligatory label to
the directive's operand. The operand can be any valid arithmetic expression
and must be defined before the point at which the EQU directive appears. The
label defined by the EQU directive must not be redefined.
7.4. SET
Format:
<label> SET <expression>
The SET directive equates the value of the obligatory label to the
directive's operand. The operand can be any valid arithmetic expression and
must be defined before the point at which the SET directive appears. The
label defined by the SET directive, unlike the EQU directive, may be
redefined.
7.5. REG
Format:
<label> REG <register range>[/<register range>...]
The REG directive equates the obligatory label to the register list, to be
handled as a single operand by subsequent MOVEM instructions.
The 'register range' operand can either be a single register, <Dn> or <An>,
or a range of registers <Dn-Dm>, <An-Am>. The registers and ranges may be
specified in any order, thus all the following are identical:
D1/D2/D3/A0/A1/A2/A3, D1-D3/A0-A3, A3-A0/D3-D1
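The equivalence of these register lists can be checked with a short Python
sketch. The function expand_register_list is hypothetical, and it assumes
that a range never mixes data and address registers, since only <Dn-Dm> and
<An-Am> are described above.

```python
def expand_register_list(text):
    """Expand a register list such as 'D1-D3/A0-A3' into a set of register names."""
    registers = set()
    for part in text.upper().split("/"):
        first, _, last = part.partition("-")
        last = last or first                    # a single register is a range of one
        if not first or first[0] != last[0] or first[0] not in "DA":
            raise ValueError("invalid register range: " + part)
        lo, hi = sorted((int(first[1:]), int(last[1:])))   # A3-A0 equals A0-A3
        if hi > 7:
            raise ValueError("invalid register range: " + part)
        registers.update(first[0] + str(n) for n in range(lo, hi + 1))
    return registers
```

All three spellings listed above expand to the same set of seven registers.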
7.6. DC - define constant
Format:
[<label>] DC[.<size>] <item>[,<item>...]
This directive will cause an appropriate number of memory locations to be
initialized to the values specified by the consecutive item operands. The
optional label is equated to the address of the start of the block, and the
size parameter defines the number of bytes allocated for each item. The item
argument is an expression defining the value to be placed in the corresponding
memory locations.
7.7. DCB - define constant block
Format:
[<label>] DCB[.<size>] <length>,<value>
The optional label is equated to the address of the start of the block.
The size code specifies a block of bytes (size code .B), words (.W) or long
words (.L). If the size code is omitted then word size is assumed.
The length argument can be any non-negative expression; it defines the
number of elements in the block. It has to be defined before the statement
where the DCB directive appears.
The value argument is an expression defining the value to be placed in
each element of the block; it does not have to be defined beforehand.
Word and long word sized elements are placed starting at a word boundary;
the assembler will increment the location counter by one if necessary. On the
other hand, a block of byte sized elements will start at any location, but the
following instruction will be placed at a word boundary, again by incrementing
the location counter if necessary.
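The alignment rules can be illustrated with a Python sketch. The names
word_align and dcb_layout are hypothetical; the sketch only models the
location counter, not the generated code.

```python
ELEMENT_BYTES = {"B": 1, "W": 2, "L": 4}

def word_align(address):
    """Round an address up to the next even (word) boundary."""
    return address + (address & 1)

def dcb_layout(location, size, length):
    """Return (block start, location counter for the following instruction)."""
    if length < 0:
        raise ValueError("negative block length")
    if size != "B":
        location = word_align(location)       # words and long words start word aligned
    end = location + ELEMENT_BYTES[size] * length
    return location, word_align(end)          # following instruction is word aligned
```

For example, a block of four words requested at odd address $1001 starts at
$1002, while a block of two bytes starts at $1001 and the following
instruction is placed at $1004.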
7.8. DS - define storage
Format:
[<label>] DS[.<size>] <length>
The optional label is equated to the address of the start of the block.
The size code specifies a block of bytes (size code .B), words (.W) or long
words (.L). If the size code is omitted then word size is assumed.
The length argument can be any non-negative expression; it defines the
number of elements in the block. It has to be defined before the statement
where the DS directive appears.
Word and long word sized elements are placed starting at a word boundary;
the assembler will increment the location counter by one if necessary. On the
other hand, a block of byte sized elements will start at any location, but the
following instruction will be placed at a word boundary, again by incrementing
the location counter if necessary.
The memory block reserved for storage by the DS directive is not
initialized.
8. Expressions
An expression is a sequence of numeric values and operators.
8.1. Numeric values
The numeric values can be entered as symbols (4.1.), or as explicit values
in a decimal, octal, binary, or hexadecimal base.
A sequence of decimal digits is assumed to be a decimal value. Explicit
values in bases other than decimal are denoted by preceding the sequence of
digits with the base designator:
binary %
octal @
hexadecimal $
The usual conventions apply as regards the digits used in non-decimal
bases:
binary digits 0, 1
octal 0, 1, 2, 3, 4, 5, 6, 7
hexadecimal 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, a, b, c, d, e, f
or 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F
Examples:
Valid numeric values are: decimal: 0, 5, 142857
binary: %0, %101, %100010111000001001
octal: @0, @5, @427011
hexadecimal: $0, $5, $3fe, $FF7B
All values in an expression, both intermediate and final, are limited to
32 bits. Larger values, whether entered or generated, are reduced modulo (2 to
the power of 32) = $100000000.
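The base designators and the 32-bit limit can be sketched in Python as
follows. The function parse_number is hypothetical; it silently reduces large
values, whereas the assembler is documented to issue a warning as well.

```python
BASES = {"%": 2, "@": 8, "$": 16}

def parse_number(text):
    """Convert an explicit numeric value to an integer, reduced modulo 2**32."""
    base = BASES.get(text[0], 10)
    digits = text[1:] if text[0] in BASES else text
    return int(digits, base) % 2**32
```

The three larger examples above, 142857, %100010111000001001 and @427011,
all denote the same value.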
8.2. Order of precedence
Expressions are evaluated in the usual order of precedence: the
parenthesized sub-expressions are evaluated first, then the expression is
evaluated in the order of precedence of the operators. Precedence of operators
is listed below.
8.3. Operators
The operators recognized by the assembler are listed below, in groups of
precedence. Highest precedence is listed first:
1. Unary minus -,
Bitwise negation ~ (one's complement),
2. Left shift << (a<<b is a shifted left by b bits, zero filled),
Right shift >> (a>>b is a shifted right by b bits, zero filled),
3. Bitwise AND &,
Bitwise OR !
4. Multiplication *,
Division / (truncated integer division, i.e. 5/3=1),
Remainder \ (5\3=2),
5. Addition +,
Subtraction -.
Operators of the same precedence are evaluated left to right.
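Because these precedence groups differ from those of most programming
languages (shifts and Boolean operators bind tighter than multiplication), a
small Python evaluator may help to check how an expression will be read. It is
a sketch only: the names are hypothetical, symbols are not supported, and all
values are treated as unsigned 32-bit quantities, which is an assumption for
division and remainder of negative operands.

```python
import re

MASK = 0xFFFFFFFF
TOKEN_RE = re.compile(r"\s*(<<|>>|\$[0-9A-Fa-f]+|@[0-7]+|%[01]+|\d+|[-+*/\\&!~()])")

# Binary operators: binding strength (higher binds tighter) and operation.
BINARY = {
    "<<": (4, lambda a, b: a << b),
    ">>": (4, lambda a, b: a >> b),
    "&":  (3, lambda a, b: a & b),
    "!":  (3, lambda a, b: a | b),
    "*":  (2, lambda a, b: a * b),
    "/":  (2, lambda a, b: a // b),
    "\\": (2, lambda a, b: a % b),
    "+":  (1, lambda a, b: a + b),
    "-":  (1, lambda a, b: a - b),
}

def tokenize(text):
    """Split an expression into numbers, operators and parentheses."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if not match:
            raise ValueError("invalid syntax")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

def evaluate(text):
    """Evaluate an expression using the precedence groups listed above."""
    tokens = tokenize(text)
    pos = 0

    def primary():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("invalid syntax")
        tok = tokens[pos]
        pos += 1
        if tok == "-":                         # unary minus, highest precedence
            return -primary() & MASK
        if tok == "~":                         # one's complement
            return ~primary() & MASK
        if tok == "(":
            value = binary(1)
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("invalid syntax")
            pos += 1
            return value
        base = {"$": 16, "@": 8, "%": 2}.get(tok[0])
        return (int(tok[1:], base) if base else int(tok)) & MASK

    def binary(min_strength):
        nonlocal pos
        left = primary()
        while (pos < len(tokens) and tokens[pos] in BINARY
               and BINARY[tokens[pos]][0] >= min_strength):
            strength, operation = BINARY[tokens[pos]]
            pos += 1
            right = binary(strength + 1)       # +1 gives left-to-right evaluation
            left = operation(left, right) & MASK
        return left

    result = binary(1)
    if pos != len(tokens):
        raise ValueError("invalid syntax")
    return result
```

For example, evaluate("6&3*2") gives 4, because 6&3 is evaluated before the
multiplication, and evaluate("1+2<<3") gives 17, because the shift is
evaluated before the addition.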
9. Assembler processing
The assembler processes the source file in two passes.
In the first pass:
- each label string of characters is stored in the label table, together
with its value stored when the definition of the label is encountered,
- the number of target code bytes is established for each source line.
When the number of target code bytes depends on the operand size, and the
operand size is unknown at the time it is encountered (e.g. the operand uses
a label which has not been defined yet), then the most pessimistic assumption
(i.e. 32-bit size) is made, except for the branch operations. Branch operations
of unknown size are assumed to be 16-bit branches.
At the second pass the required output (target code, assembly listing) is
generated.
The assembler will make assumptions in some other cases. For instance MOVE
mnemonic will be processed as MOVEA if the operand is an address, or as MOVEQ
if the source operand is an immediate value not exceeding 8 bits and the target
operand is a data register.
10. Addressing modes
All addressing modes of the effective address as specified in Ref. 1. are
supported. The following symbols are used to describe the operand formats:
Dn = Data Register
An = Address Register
SP = A7
Xn = Data or Address register used as Index register
size = size code (B, W or L)
scale = scale in indexed addressing (1, 2, 4 or 8)
d8 = 8-bit displacement
d16 = 16-bit displacement
bd = base displacement (16- or 32-bit)
od = outer displacement (16- or 32-bit)
ex16 = Expression that evaluates to a 16-bit value
ex = Any expression
PC = Program Counter
The following register names may also be used as operands in certain
instructions (e.g. MOVEC, MOVE to CCR):
SR = Status Register
CCR = Condition Code Register
USP = User Stack Pointer
SFC = Source Function Code Register
DFC = Destination Function Code Register
VBR = Vector Base Register
CACR = Cache Control Register
CAAR = Cache Address Register
MSP = Master Stack Pointer
ISP = Interrupt Stack Pointer
Effective Address Modes
Mode Assembler Format
--------------------------------------------- ----------------
Data Register Direct Dn
Address Register Direct An
Address Register Indirect (An)
Address Register Indirect with Postincrement (An)+
Address Register Indirect with Predecrement -(An)
Address Register Indirect with Displacement (d16,An)
Address Register Indirect with Index
(8-bit displacement) (d8,An,Xn.size*scale)
Address Register Indirect with Index
(base displacement) (bd,An,Xn.size*scale)
Memory Indirect Post-Indexed ([bd,An],Xn.size*scale,od)
Memory Indirect Pre-Indexed ([bd,An,Xn.size*scale],od)
Program Counter Indirect with Displacement (d16,PC)
Program Counter Indirect with Index
(8-bit displacement) (d8,PC,Xn.size*scale)
Program Counter Indirect with Index
(base displacement) (bd,PC,Xn.size*scale)
Program Counter Memory Indirect Post-Indexed ([bd,PC],Xn.size*scale,od)
Program Counter Memory Indirect Pre-Indexed ([bd,PC,Xn.size*scale],od)
Absolute Long Address (ex).L
Absolute Short Address (ex16).W
Immediate Data #ex
In all non-indexed modes SP can be used in place of A7.
The following special cases of the above addressing modes, while legal, are not
supported:
- Memory Indirect with all elements suppressed ([],,) and ([,],)
- Memory Indirect without index ([An],,) and ([An,],)
- Memory Indirect without base register ([],Xn,) and ([,Xn],)
- PC Memory Indirect with all elements suppressed ([ZPC],,) and ([ZPC,],)
- PC Memory Indirect without index ([PC],,) and ([PC,],)
- PC Memory Indirect without base register ([ZPC],Xn,) and ([ZPC,Xn],)
10.1. Size
Many instructions, and directives, can have their size specified. The
size specification is coded by appending .B for 8-bit, .W for 16-bit, and .L
for 32-bit size. The only exception is the set of the branch instructions
where the three sizes are coded as .S, .W, and .L, respectively.
When the size specification is omitted, the assembler tries to make a
guess: if the value of the expression is known at the first pass the assembler
will allocate size as appropriate, otherwise it will allocate the 32-bit size
as the most pessimistic assumption. Again the branch instructions are handled
differently - 16-bit branches are assumed by default.
At the second pass all expression values are known and the assembler will
flag excessive branch sizes as warnings, and branches that need a 32-bit
displacement but were allocated 16 bits as errors.
Not all microprocessor instructions accept all three sizes. Refer to
Ref. 1. for details.
10.2. QUICK instructions
The three "quick" instructions: ADDQ, MOVEQ, and SUBQ are automatically
selected by the assembler when the addressing mode and operand value conform
to the "quick" version requirements:
move.l #<data>,Dn
will be coded as
moveq #<data>,Dn
if <data> is known at the first pass and does not exceed 8 bits, and similarly
add[.size] #<data>,<ea>
sub[.size] #<data>,<ea>
will be coded as
addq #<data>,<ea>
subq #<data>,<ea>
if <data> is known at the first pass and falls within the range of 1 to 8.
The MOVEQ instruction will accept both signed and unsigned operands, i.e.
<data> within -128 and 255. Naturally, negative operands are coded as
corresponding positive operands above 127, according to the rules of "two's
complement", e.g. decimal values -55 and +201 will both be coded as hexadecimal
C9.
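The "quick" conditions and the MOVEQ coding can be sketched in Python. The
function names are hypothetical. The opcode layout used (0111 rrr0 dddddddd,
register in bits 11-9, data in bits 7-0) is the Motorola MOVEQ encoding;
raising an exception for out-of-range data is a choice made for this sketch,
not the assembler's documented behaviour.

```python
def moveq_opcode(data, dn):
    """Encode MOVEQ #data,Dn as a 16-bit opcode word."""
    if not -128 <= data <= 255:
        raise ValueError("data outside the range -128 to 255")
    if not 0 <= dn <= 7:
        raise ValueError("invalid data register number")
    return 0x7000 | (dn << 9) | (data & 0xFF)   # negative data in two's complement

def fits_addq_subq(data):
    """True when an ADD or SUB immediate can be coded as ADDQ or SUBQ."""
    return 1 <= data <= 8
```

Both moveq_opcode(-55, 0) and moveq_opcode(201, 0) give $70C9, in line with
the C9 example above.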
11. Assembler output
The target code is stored in the file bearing the same file name as the
source file, but with the extension changed to .H68. The code is stored using
the Motorola S-record format.
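As an illustration of the S-record format, the Python sketch below builds a
single S1 data record (16-bit address). The function s1_record is
hypothetical, and which record types (S1, S2 or S3) this assembler actually
writes is not stated in this manual, so the choice of S1 is an assumption.

```python
def s1_record(address, data):
    """Build one S1 record (16-bit address) for the given data bytes."""
    count = len(data) + 3                      # 2 address bytes + data + 1 checksum byte
    body = bytes([count, (address >> 8) & 0xFF, address & 0xFF]) + bytes(data)
    checksum = ~sum(body) & 0xFF               # one's complement of the low byte of the sum
    return "S1" + body.hex().upper() + format(checksum, "02X")
```

The record consists of the type "S1", a byte count, the address, the data,
and a checksum computed over the count, address and data bytes.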
12. Assembly listing
The assembly listing is stored in a file bearing the same name as the
source file, but with the extension changed to .LIS. The listing format is
as follows: each line starts with a 5-digit decimal line number, generated
automatically by the assembler, followed by the 8-digit hexadecimal program
counter value. This in turn is followed by the hexadecimal opcode and its
argument, after which the source line is appended.
The listing is formatted. Whatever the size of the source file fields, the
listing allocates the following field sizes, listed in the order of their
appearance in the listing line:
line number - 6 characters
program counter - 9 characters
hex opcode and argument - 35 characters
label field - 10 characters
opcode mnemonic - 8 characters
operand - 16 characters
comment - remainder of line.
Except the line number and the program counter fields, each field's content
can exceed its allocated size. If hexadecimal opcode and argument exceed their
allocation, then only part of the code is shown, with an ellipsis ("...")
appended to indicate that the listing is not complete. To show all code it is
necessary to invoke the assembler with the -c option. In this case the
remainder of the code is printed in the following line.
The fields following the line number and the program counter use up as
much space as required, with just one blank separating each field from the
next.
Obviously it is not convenient to have lines overflowing the printer's
line length. In most cases if the source line does not exceed 80 characters
then the resulting listing line does not exceed 132 characters. Most printers
can accommodate this line length, at least in condensed mode.
13. Error reporting
As the assembler processes each source statement it records any errors
found. An error is classed according to its severity and recorded in an error
buffer. If more than one error is found then the error in the buffer is
replaced by the newly found one, provided that the new error's severity is
higher; otherwise the new error is ignored. When the source statement
processing is complete, the error in the buffer is reported, both in the
assembly listing file in the line following the statement it refers to, and on
the screen.
There are four severity classes:
- severe errors,
- errors,
- minor errors,
- warnings.
The format of an assembly listing error message is:
ERROR: <error message>
for severe errors, errors, and minor errors, or
WARNING: <error message>
for warnings. The error message appearing on the screen is preceded by:
in line <line number>:
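The one-error-per-statement rule can be sketched in Python. The class
ErrorBuffer and its method names are hypothetical and do not come from the
assembler's source; the report is returned as a tuple so that no exact
message layout is implied.

```python
SEVERITY = {"warning": 1, "minor error": 2, "error": 3, "severe error": 4}

class ErrorBuffer:
    """Holds at most one error per source statement: the most severe one seen."""

    def __init__(self):
        self.pending = None                    # (severity class, message) or None

    def record(self, severity, message):
        # A new error replaces the buffered one only if it is strictly more severe.
        if self.pending is None or SEVERITY[severity] > SEVERITY[self.pending[0]]:
            self.pending = (severity, message)

    def report(self, line_number):
        """Return (line number, prefix, message), or None, and clear the buffer."""
        if self.pending is None:
            return None
        severity, message = self.pending
        self.pending = None
        prefix = "WARNING" if severity == "warning" else "ERROR"
        return (line_number, prefix, message)
```

If a statement produces a warning, then an error, then a minor error, only
the error is reported for that statement.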
The error messages, listed according to severity, are as follows:
Severe errors:
- Invalid syntax
- Invalid opcode
- Invalid addressing mode
- Label required with this directive
- Symbol value differs between first and second pass
- This code is not implemented
- Short branch to the immediately following instruction is not allowed
Errors:
- Undefined symbol
- Division by zero attempted
- Symbol multiply defined
- Register list multiply defined
- Register list symbol not previously defined
- Forward references not allowed with this directive
- Block length is less that zero
Minor errors:
- Invalid size code
- Invalid vector number
- Branch instruction displacement is out of range or invalid
- Displacement out of range
- Absolute address exceeds 16 bits
- Immediate data exceeds 3 bits
- Immediate data exceeds 8 bits
- Immediate data exceeds 16 bits
- Origin value is odd, location counter set to next higher address
- The symbol specified is not a register list symbol
- Register list symbol used in an expression
- Invalid constant shift count
- Invalid label character
Warnings:
- ASCII constant exceeds 4 characters
- Numeric constant exceeds 32 bits
- Evaluation of expression could not be completed
- Excessive size
- Unsized instruction, size ignored
- Invalid or illegal size ignored
- Invalid size, corrected
- MOVEQ instruction constant exceeds 8 bits,
Least significant 8 bits used
- No message defined
The last message is included as a safety measure. It will intercept an
illegal internal error code, should one be generated because of a malfunction
of the assembler.
There are also a number of error messages potentially generated by various
functions of the assembler. These are included to aid assembler modifications
and should only be visible if changes are made to the assembler.
14. References
1. Motorola "MC68020 User's Manual", MC68020UM/AD REV 1, Second Edition.
2. 68000 Assembler, Version 1.0, written by P. McKee at North Carolina State
University, Electrical and Computer Engineering department,
released to Public Domain by M. Shaban, 8/4/89.